{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PjRuzfwyImeC"
   },
   "source": [
    "# Tracing, Evaluating, and Profiling your Agent\n",
    "\n",
    "In this notebook, we will walk through the advanced capabilities of NVIDIA NeMo Agent toolkit (NAT) for <a href=\"https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/observe/index.html\"> observability</a>, <a href=\"https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/evaluate.html\">evaluation</a>, and <a href=\"https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/profiler.html\">profiling</a>, from setting up Phoenix tracing to running comprehensive workflow assessments and performance analysis."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Table of Contents\n",
    "\n",
    "- [0.0) Setup](#setup)\n",
    "  - [0.1) Prerequisites](#prereqs)\n",
    "  - [0.2) API Keys](#api-keys)\n",
    "  - [0.3) Data Sources](#data-sources)\n",
    "  - [0.4) Installing NeMo Agent Toolkit](#installing-nat)\n",
    "- [1.0) Creating a Tool-Calling Workflow](#creating-workflow)\n",
    "  - [1.1) Total Product Sales Tool](#product-sales-tool)\n",
    "  - [1.2) Sales Per Day Tool](#sales-per-day-tool)\n",
    "  - [1.3) Detect Outliers Tool](#detect-outliers-tool)\n",
    "  - [1.4) Technical Specs Retrieval Tool](#technical-specs-tool)\n",
    "  - [1.5) Data Analysis/Plotting Tools](#plotting-tools)\n",
    "  - [1.6) Register The Tools](#register-tools)\n",
    "  - [1.7) Workflow Configuration File](#workflow-config)\n",
    "  - [1.8) Testing/Verifying Workflow Installation](#verify-tools)\n",
    "- [2.0) Observing a Workflow with Phoenix](#observe-workflow)\n",
    "  - [2.1) Updating the Workflow Configuration For Telemetry](#update-config)\n",
    "  - [2.2) Start Phoenix Server](#start-phoenix)\n",
    "  - [2.3) Rerun the Workflow](#rerun-workflow)\n",
    "  - [2.4) Viewing the Trace](#view-trace)\n",
    "- [3.0) Evaluating a Workflow](#eval-workflow)\n",
    "  - [3.1) Create an Evaluation Dataset](#eval-dataset)\n",
    "  - [3.2) Updating the Workflow Configuration](#update-config-again)\n",
    "  - [3.3) Running the Evaluation](#run-eval)\n",
    "  - [3.4) Understanding Evaluation Results](#understand-eval)\n",
    "- [4.0) Profiling a Workflow](#profile-workflow)\n",
    "  - [4.1) Updating the Workflow Configuration](#update-profiling-workflow)\n",
    "  - [4.2) Understanding the Profiler Configuration](#understand-profiler-config)\n",
    "  - [4.3) Running the Profiler](#run-profiler)\n",
    "  - [4.4) Understanding Profiler Output Files](#understand-profiler-output-files)\n",
    "- [5.0) Notebook Summary](#summary)\n",
    "- [6.0) Next Steps](#next-steps)\n",
    "\n",
    "<span style=\"color:rgb(0, 31, 153); font-style: italic;\">Note: In Google Colab use the Table of Contents tab to navigate.</span>\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"setup\"></a>\n",
    "# 0.0) Setup\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "p4b2tXeEB5MH"
   },
   "source": [
    "<a id=\"prereqs\"></a>\n",
    "## 0.1) Prerequisites\n",
    "\n",
    "- **Platform:** Linux, macOS, or Windows\n",
    "- **Python:** version 3.11, 3.12, or 3.13\n",
    "- **Python Packages:** `pip`"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "PzjU1lTaE3gW"
   },
   "source": [
    "<a id=\"api-keys\"></a>\n",
    "## 0.2) API Keys"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "3g2OD3D3TAuN"
   },
   "source": [
    "For this notebook, you will need the following API keys to run all examples end-to-end:\n",
    "\n",
    "- **NVIDIA Build:** You can obtain an NVIDIA Build API Key by creating an [NVIDIA Build](https://build.nvidia.com) account and generating a key at https://build.nvidia.com/settings/api-keys\n",
    "\n",
    "Then you can run the cell below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import getpass\n",
    "import os\n",
    "\n",
    "if \"NVIDIA_API_KEY\" not in os.environ:\n",
    "    nvidia_api_key = getpass.getpass(\"Enter your NVIDIA API key: \")\n",
    "    os.environ[\"NVIDIA_API_KEY\"] = nvidia_api_key"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"data-sources\"></a>\n",
    "## 0.3) Data Sources"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ELmZ_Pdz-qX7"
   },
   "source": [
    "Several data files are required for this example. To keep this as a stand-alone example, the files are included here as cells which can be run to create them.\n",
    "\n",
    "The following cell creates the `data` directory as well as a `rag` subdirectory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!mkdir -p data/rag"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following cell writes the `data/retail_sales_data.csv` file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile data/retail_sales_data.csv\n",
    "Date,StoreID,Product,UnitsSold,Revenue,Promotion\n",
    "2024-01-01,S001,Laptop,1,1000,No\n",
    "2024-01-01,S001,Phone,9,4500,No\n",
    "2024-01-01,S001,Tablet,2,600,No\n",
    "2024-01-01,S002,Laptop,9,9000,No\n",
    "2024-01-01,S002,Phone,10,5000,No\n",
    "2024-01-01,S002,Tablet,5,1500,No\n",
    "2024-01-02,S001,Laptop,4,4000,No\n",
    "2024-01-02,S001,Phone,11,5500,No\n",
    "2024-01-02,S001,Tablet,7,2100,No\n",
    "2024-01-02,S002,Laptop,7,7000,No\n",
    "2024-01-02,S002,Phone,6,3000,No\n",
    "2024-01-02,S002,Tablet,9,2700,No\n",
    "2024-01-03,S001,Laptop,6,6000,No\n",
    "2024-01-03,S001,Phone,7,3500,No\n",
    "2024-01-03,S001,Tablet,8,2400,No\n",
    "2024-01-03,S002,Laptop,3,3000,No\n",
    "2024-01-03,S002,Phone,16,8000,No\n",
    "2024-01-03,S002,Tablet,5,1500,No\n",
    "2024-01-04,S001,Laptop,5,5000,No\n",
    "2024-01-04,S001,Phone,11,5500,No\n",
    "2024-01-04,S001,Tablet,9,2700,No\n",
    "2024-01-04,S002,Laptop,2,2000,No\n",
    "2024-01-04,S002,Phone,12,6000,No\n",
    "2024-01-04,S002,Tablet,7,2100,No\n",
    "2024-01-05,S001,Laptop,8,8000,No\n",
    "2024-01-05,S001,Phone,18,9000,No\n",
    "2024-01-05,S001,Tablet,5,1500,No\n",
    "2024-01-05,S002,Laptop,7,7000,No\n",
    "2024-01-05,S002,Phone,10,5000,No\n",
    "2024-01-05,S002,Tablet,10,3000,No\n",
    "2024-01-06,S001,Laptop,9,9000,No\n",
    "2024-01-06,S001,Phone,11,5500,No\n",
    "2024-01-06,S001,Tablet,5,1500,No\n",
    "2024-01-06,S002,Laptop,5,5000,No\n",
    "2024-01-06,S002,Phone,14,7000,No\n",
    "2024-01-06,S002,Tablet,10,3000,No\n",
    "2024-01-07,S001,Laptop,2,2000,No\n",
    "2024-01-07,S001,Phone,15,7500,No\n",
    "2024-01-07,S001,Tablet,6,1800,No\n",
    "2024-01-07,S002,Laptop,0,0,No\n",
    "2024-01-07,S002,Phone,7,3500,No\n",
    "2024-01-07,S002,Tablet,12,3600,No\n",
    "2024-01-08,S001,Laptop,5,5000,No\n",
    "2024-01-08,S001,Phone,8,4000,No\n",
    "2024-01-08,S001,Tablet,5,1500,No\n",
    "2024-01-08,S002,Laptop,4,4000,No\n",
    "2024-01-08,S002,Phone,11,5500,No\n",
    "2024-01-08,S002,Tablet,9,2700,No\n",
    "2024-01-09,S001,Laptop,6,6000,No\n",
    "2024-01-09,S001,Phone,9,4500,No\n",
    "2024-01-09,S001,Tablet,8,2400,No\n",
    "2024-01-09,S002,Laptop,7,7000,No\n",
    "2024-01-09,S002,Phone,11,5500,No\n",
    "2024-01-09,S002,Tablet,8,2400,No\n",
    "2024-01-10,S001,Laptop,6,6000,No\n",
    "2024-01-10,S001,Phone,11,5500,No\n",
    "2024-01-10,S001,Tablet,5,1500,No\n",
    "2024-01-10,S002,Laptop,8,8000,No\n",
    "2024-01-10,S002,Phone,5,2500,No\n",
    "2024-01-10,S002,Tablet,6,1800,No\n",
    "2024-01-11,S001,Laptop,5,5000,No\n",
    "2024-01-11,S001,Phone,7,3500,No\n",
    "2024-01-11,S001,Tablet,5,1500,No\n",
    "2024-01-11,S002,Laptop,4,4000,No\n",
    "2024-01-11,S002,Phone,10,5000,No\n",
    "2024-01-11,S002,Tablet,4,1200,No\n",
    "2024-01-12,S001,Laptop,2,2000,No\n",
    "2024-01-12,S001,Phone,10,5000,No\n",
    "2024-01-12,S001,Tablet,9,2700,No\n",
    "2024-01-12,S002,Laptop,8,8000,No\n",
    "2024-01-12,S002,Phone,10,5000,No\n",
    "2024-01-12,S002,Tablet,14,4200,No\n",
    "2024-01-13,S001,Laptop,3,3000,No\n",
    "2024-01-13,S001,Phone,6,3000,No\n",
    "2024-01-13,S001,Tablet,9,2700,No\n",
    "2024-01-13,S002,Laptop,1,1000,No\n",
    "2024-01-13,S002,Phone,12,6000,No\n",
    "2024-01-13,S002,Tablet,7,2100,No\n",
    "2024-01-14,S001,Laptop,4,4000,Yes\n",
    "2024-01-14,S001,Phone,16,8000,Yes\n",
    "2024-01-14,S001,Tablet,4,1200,Yes\n",
    "2024-01-14,S002,Laptop,5,5000,Yes\n",
    "2024-01-14,S002,Phone,14,7000,Yes\n",
    "2024-01-14,S002,Tablet,6,1800,Yes\n",
    "2024-01-15,S001,Laptop,9,9000,No\n",
    "2024-01-15,S001,Phone,6,3000,No\n",
    "2024-01-15,S001,Tablet,11,3300,No\n",
    "2024-01-15,S002,Laptop,5,5000,No\n",
    "2024-01-15,S002,Phone,10,5000,No\n",
    "2024-01-15,S002,Tablet,4,1200,No\n",
    "2024-01-16,S001,Laptop,6,6000,No\n",
    "2024-01-16,S001,Phone,11,5500,No\n",
    "2024-01-16,S001,Tablet,5,1500,No\n",
    "2024-01-16,S002,Laptop,4,4000,No\n",
    "2024-01-16,S002,Phone,7,3500,No\n",
    "2024-01-16,S002,Tablet,4,1200,No\n",
    "2024-01-17,S001,Laptop,6,6000,No\n",
    "2024-01-17,S001,Phone,14,7000,No\n",
    "2024-01-17,S001,Tablet,7,2100,No\n",
    "2024-01-17,S002,Laptop,3,3000,No\n",
    "2024-01-17,S002,Phone,7,3500,No\n",
    "2024-01-17,S002,Tablet,6,1800,No\n",
    "2024-01-18,S001,Laptop,7,7000,Yes\n",
    "2024-01-18,S001,Phone,10,5000,Yes\n",
    "2024-01-18,S001,Tablet,6,1800,Yes\n",
    "2024-01-18,S002,Laptop,5,5000,Yes\n",
    "2024-01-18,S002,Phone,16,8000,Yes\n",
    "2024-01-18,S002,Tablet,8,2400,Yes\n",
    "2024-01-19,S001,Laptop,4,4000,No\n",
    "2024-01-19,S001,Phone,12,6000,No\n",
    "2024-01-19,S001,Tablet,7,2100,No\n",
    "2024-01-19,S002,Laptop,3,3000,No\n",
    "2024-01-19,S002,Phone,12,6000,No\n",
    "2024-01-19,S002,Tablet,8,2400,No\n",
    "2024-01-20,S001,Laptop,6,6000,No\n",
    "2024-01-20,S001,Phone,8,4000,No\n",
    "2024-01-20,S001,Tablet,6,1800,No\n",
    "2024-01-20,S002,Laptop,8,8000,No\n",
    "2024-01-20,S002,Phone,9,4500,No\n",
    "2024-01-20,S002,Tablet,8,2400,No\n",
    "2024-01-21,S001,Laptop,3,3000,No\n",
    "2024-01-21,S001,Phone,9,4500,No\n",
    "2024-01-21,S001,Tablet,5,1500,No\n",
    "2024-01-21,S002,Laptop,8,8000,No\n",
    "2024-01-21,S002,Phone,15,7500,No\n",
    "2024-01-21,S002,Tablet,7,2100,No\n",
    "2024-01-22,S001,Laptop,1,1000,No\n",
    "2024-01-22,S001,Phone,15,7500,No\n",
    "2024-01-22,S001,Tablet,5,1500,No\n",
    "2024-01-22,S002,Laptop,11,11000,No\n",
    "2024-01-22,S002,Phone,4,2000,No\n",
    "2024-01-22,S002,Tablet,4,1200,No\n",
    "2024-01-23,S001,Laptop,3,3000,No\n",
    "2024-01-23,S001,Phone,8,4000,No\n",
    "2024-01-23,S001,Tablet,8,2400,No\n",
    "2024-01-23,S002,Laptop,6,6000,No\n",
    "2024-01-23,S002,Phone,12,6000,No\n",
    "2024-01-23,S002,Tablet,12,3600,No\n",
    "2024-01-24,S001,Laptop,2,2000,No\n",
    "2024-01-24,S001,Phone,14,7000,No\n",
    "2024-01-24,S001,Tablet,6,1800,No\n",
    "2024-01-24,S002,Laptop,1,1000,No\n",
    "2024-01-24,S002,Phone,5,2500,No\n",
    "2024-01-24,S002,Tablet,7,2100,No\n",
    "2024-01-25,S001,Laptop,7,7000,No\n",
    "2024-01-25,S001,Phone,11,5500,No\n",
    "2024-01-25,S001,Tablet,11,3300,No\n",
    "2024-01-25,S002,Laptop,6,6000,No\n",
    "2024-01-25,S002,Phone,11,5500,No\n",
    "2024-01-25,S002,Tablet,5,1500,No\n",
    "2024-01-26,S001,Laptop,5,5000,Yes\n",
    "2024-01-26,S001,Phone,22,11000,Yes\n",
    "2024-01-26,S001,Tablet,7,2100,Yes\n",
    "2024-01-26,S002,Laptop,6,6000,Yes\n",
    "2024-01-26,S002,Phone,24,12000,Yes\n",
    "2024-01-26,S002,Tablet,3,900,Yes\n",
    "2024-01-27,S001,Laptop,7,7000,Yes\n",
    "2024-01-27,S001,Phone,20,10000,Yes\n",
    "2024-01-27,S001,Tablet,6,1800,Yes\n",
    "2024-01-27,S002,Laptop,4,4000,Yes\n",
    "2024-01-27,S002,Phone,8,4000,Yes\n",
    "2024-01-27,S002,Tablet,6,1800,Yes\n",
    "2024-01-28,S001,Laptop,10,10000,No\n",
    "2024-01-28,S001,Phone,15,7500,No\n",
    "2024-01-28,S001,Tablet,12,3600,No\n",
    "2024-01-28,S002,Laptop,6,6000,No\n",
    "2024-01-28,S002,Phone,11,5500,No\n",
    "2024-01-28,S002,Tablet,10,3000,No\n",
    "2024-01-29,S001,Laptop,3,3000,No\n",
    "2024-01-29,S001,Phone,16,8000,No\n",
    "2024-01-29,S001,Tablet,5,1500,No\n",
    "2024-01-29,S002,Laptop,6,6000,No\n",
    "2024-01-29,S002,Phone,17,8500,No\n",
    "2024-01-29,S002,Tablet,2,600,No\n",
    "2024-01-30,S001,Laptop,3,3000,No\n",
    "2024-01-30,S001,Phone,11,5500,No\n",
    "2024-01-30,S001,Tablet,2,600,No\n",
    "2024-01-30,S002,Laptop,6,6000,No\n",
    "2024-01-30,S002,Phone,16,8000,No\n",
    "2024-01-30,S002,Tablet,8,2400,No\n",
    "2024-01-31,S001,Laptop,5,5000,Yes\n",
    "2024-01-31,S001,Phone,22,11000,Yes\n",
    "2024-01-31,S001,Tablet,9,2700,Yes\n",
    "2024-01-31,S002,Laptop,3,3000,Yes\n",
    "2024-01-31,S002,Phone,14,7000,Yes\n",
    "2024-01-31,S002,Tablet,4,1200,Yes\n",
    "2024-02-01,S001,Laptop,2,2000,No\n",
    "2024-02-01,S001,Phone,7,3500,No\n",
    "2024-02-01,S001,Tablet,11,3300,No\n",
    "2024-02-01,S002,Laptop,6,6000,No\n",
    "2024-02-01,S002,Phone,11,5500,No\n",
    "2024-02-01,S002,Tablet,5,1500,No\n",
    "2024-02-02,S001,Laptop,2,2000,No\n",
    "2024-02-02,S001,Phone,9,4500,No\n",
    "2024-02-02,S001,Tablet,7,2100,No\n",
    "2024-02-02,S002,Laptop,5,5000,No\n",
    "2024-02-02,S002,Phone,9,4500,No\n",
    "2024-02-02,S002,Tablet,12,3600,No\n",
    "2024-02-03,S001,Laptop,9,9000,No\n",
    "2024-02-03,S001,Phone,12,6000,No\n",
    "2024-02-03,S001,Tablet,9,2700,No\n",
    "2024-02-03,S002,Laptop,10,10000,No\n",
    "2024-02-03,S002,Phone,6,3000,No\n",
    "2024-02-03,S002,Tablet,10,3000,No\n",
    "2024-02-04,S001,Laptop,6,6000,No\n",
    "2024-02-04,S001,Phone,5,2500,No\n",
    "2024-02-04,S001,Tablet,8,2400,No\n",
    "2024-02-04,S002,Laptop,6,6000,No\n",
    "2024-02-04,S002,Phone,10,5000,No\n",
    "2024-02-04,S002,Tablet,10,3000,No\n",
    "2024-02-05,S001,Laptop,7,7000,No\n",
    "2024-02-05,S001,Phone,13,6500,No\n",
    "2024-02-05,S001,Tablet,11,3300,No\n",
    "2024-02-05,S002,Laptop,8,8000,No\n",
    "2024-02-05,S002,Phone,11,5500,No\n",
    "2024-02-05,S002,Tablet,8,2400,No\n",
    "2024-02-06,S001,Laptop,5,5000,No\n",
    "2024-02-06,S001,Phone,14,7000,No\n",
    "2024-02-06,S001,Tablet,4,1200,No\n",
    "2024-02-06,S002,Laptop,2,2000,No\n",
    "2024-02-06,S002,Phone,11,5500,No\n",
    "2024-02-06,S002,Tablet,7,2100,No\n",
    "2024-02-07,S001,Laptop,6,6000,No\n",
    "2024-02-07,S001,Phone,7,3500,No\n",
    "2024-02-07,S001,Tablet,9,2700,No\n",
    "2024-02-07,S002,Laptop,2,2000,No\n",
    "2024-02-07,S002,Phone,8,4000,No\n",
    "2024-02-07,S002,Tablet,9,2700,No\n",
    "2024-02-08,S001,Laptop,5,5000,No\n",
    "2024-02-08,S001,Phone,12,6000,No\n",
    "2024-02-08,S001,Tablet,3,900,No\n",
    "2024-02-08,S002,Laptop,8,8000,No\n",
    "2024-02-08,S002,Phone,5,2500,No\n",
    "2024-02-08,S002,Tablet,8,2400,No\n",
    "2024-02-09,S001,Laptop,6,6000,Yes\n",
    "2024-02-09,S001,Phone,18,9000,Yes\n",
    "2024-02-09,S001,Tablet,5,1500,Yes\n",
    "2024-02-09,S002,Laptop,7,7000,Yes\n",
    "2024-02-09,S002,Phone,18,9000,Yes\n",
    "2024-02-09,S002,Tablet,5,1500,Yes\n",
    "2024-02-10,S001,Laptop,9,9000,No\n",
    "2024-02-10,S001,Phone,6,3000,No\n",
    "2024-02-10,S001,Tablet,8,2400,No\n",
    "2024-02-10,S002,Laptop,7,7000,No\n",
    "2024-02-10,S002,Phone,5,2500,No\n",
    "2024-02-10,S002,Tablet,6,1800,No\n",
    "2024-02-11,S001,Laptop,6,6000,No\n",
    "2024-02-11,S001,Phone,11,5500,No\n",
    "2024-02-11,S001,Tablet,2,600,No\n",
    "2024-02-11,S002,Laptop,7,7000,No\n",
    "2024-02-11,S002,Phone,5,2500,No\n",
    "2024-02-11,S002,Tablet,9,2700,No\n",
    "2024-02-12,S001,Laptop,5,5000,No\n",
    "2024-02-12,S001,Phone,5,2500,No\n",
    "2024-02-12,S001,Tablet,4,1200,No\n",
    "2024-02-12,S002,Laptop,1,1000,No\n",
    "2024-02-12,S002,Phone,14,7000,No\n",
    "2024-02-12,S002,Tablet,15,4500,No\n",
    "2024-02-13,S001,Laptop,3,3000,No\n",
    "2024-02-13,S001,Phone,18,9000,No\n",
    "2024-02-13,S001,Tablet,8,2400,No\n",
    "2024-02-13,S002,Laptop,5,5000,No\n",
    "2024-02-13,S002,Phone,8,4000,No\n",
    "2024-02-13,S002,Tablet,6,1800,No\n",
    "2024-02-14,S001,Laptop,4,4000,No\n",
    "2024-02-14,S001,Phone,9,4500,No\n",
    "2024-02-14,S001,Tablet,6,1800,No\n",
    "2024-02-14,S002,Laptop,4,4000,No\n",
    "2024-02-14,S002,Phone,6,3000,No\n",
    "2024-02-14,S002,Tablet,7,2100,No\n",
    "2024-02-15,S001,Laptop,4,4000,Yes\n",
    "2024-02-15,S001,Phone,26,13000,Yes\n",
    "2024-02-15,S001,Tablet,5,1500,Yes\n",
    "2024-02-15,S002,Laptop,2,2000,Yes\n",
    "2024-02-15,S002,Phone,14,7000,Yes\n",
    "2024-02-15,S002,Tablet,6,1800,Yes\n",
    "2024-02-16,S001,Laptop,7,7000,No\n",
    "2024-02-16,S001,Phone,9,4500,No\n",
    "2024-02-16,S001,Tablet,1,300,No\n",
    "2024-02-16,S002,Laptop,6,6000,No\n",
    "2024-02-16,S002,Phone,12,6000,No\n",
    "2024-02-16,S002,Tablet,10,3000,No\n",
    "2024-02-17,S001,Laptop,5,5000,No\n",
    "2024-02-17,S001,Phone,8,4000,No\n",
    "2024-02-17,S001,Tablet,14,4200,No\n",
    "2024-02-17,S002,Laptop,4,4000,No\n",
    "2024-02-17,S002,Phone,13,6500,No\n",
    "2024-02-17,S002,Tablet,7,2100,No\n",
    "2024-02-18,S001,Laptop,6,6000,Yes\n",
    "2024-02-18,S001,Phone,22,11000,Yes\n",
    "2024-02-18,S001,Tablet,9,2700,Yes\n",
    "2024-02-18,S002,Laptop,2,2000,Yes\n",
    "2024-02-18,S002,Phone,10,5000,Yes\n",
    "2024-02-18,S002,Tablet,12,3600,Yes\n",
    "2024-02-19,S001,Laptop,6,6000,No\n",
    "2024-02-19,S001,Phone,12,6000,No\n",
    "2024-02-19,S001,Tablet,3,900,No\n",
    "2024-02-19,S002,Laptop,3,3000,No\n",
    "2024-02-19,S002,Phone,4,2000,No\n",
    "2024-02-19,S002,Tablet,7,2100,No\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following cell writes the RAG product catalog file, `data/rag/product_catalog.md`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile data/rag/product_catalog.md\n",
    "# Product Catalog: Smartphones, Laptops, and Tablets\n",
    "\n",
    "## Smartphones\n",
    "\n",
    "The Veltrix Solis Z9 is a flagship device in the premium smartphone segment. It builds on a decade of design iterations that prioritize screen-to-body ratio, minimal bezels, and high refresh rate displays. The 6.7-inch AMOLED panel with 120Hz refresh rate delivers immersive visual experiences, whether in gaming, video streaming, or augmented reality applications. The display's GorillaGlass Fusion coating provides scratch resistance and durability, and the thin form factor is engineered using a titanium-aluminum alloy chassis to reduce weight without compromising rigidity.\n",
    "\n",
    "Internally, the Solis Z9 is powered by the OrionEdge V14 chipset, a 4nm process SoC designed for high-efficiency workloads. Its AI accelerator module handles on-device tasks such as voice transcription, camera optimization, and intelligent background app management. The inclusion of 12GB LPDDR5 RAM and a 256GB UFS 3.1 storage system allows for seamless multitasking, instant app launching, and rapid data access. The device supports eSIM and dual physical SIM configurations, catering to global travelers and hybrid network users.\n",
    "\n",
    "Photography and videography are central to the Solis Z9 experience. The triple-camera system incorporates a periscope-style 8MP telephoto lens with 5x optical zoom, a 12MP ultra-wide sensor with macro capabilities, and a 64MP main sensor featuring optical image stabilization (OIS) and phase detection autofocus (PDAF). Night mode and HDRX+ processing enable high-fidelity image capture in challenging lighting conditions.\n",
    "\n",
    "Software-wise, the device ships with LunOS 15, a lightweight Android fork optimized for modular updates and privacy compliance. The system supports secure containers for work profiles and AI-powered notifications that summarize app alerts across channels. Facial unlock is augmented by a 3D IR depth sensor, providing reliable biometric security alongside the ultrasonic in-display fingerprint scanner.\n",
    "\n",
    "The Solis Z9 is a culmination of over a decade of design experimentation in mobile form factors, ranging from curved-edge screens to under-display camera arrays. Its balance of performance, battery efficiency, and user-centric software makes it an ideal daily driver for content creators, mobile gamers, and enterprise users.\n",
    "\n",
    "## Laptops\n",
    "\n",
    "The Cryon Vanta 16X represents the latest evolution of portable computing power tailored for professional-grade workloads.\n",
    "\n",
    "The Vanta 16X features a unibody chassis milled from aircraft-grade aluminum using CNC machining. The thermal design integrates vapor chamber cooling and dual-fan exhaust architecture to support sustained performance under high computational loads. The 16-inch 4K UHD display is color-calibrated at the factory and supports HDR10+, making it suitable for cinematic video editing and high-fidelity CAD modeling.\n",
    "\n",
    "Powering the device is Intel's Core i9-13900H processor, which includes 14 cores with a hybrid architecture combining performance and efficiency cores. This allows the system to dynamically balance power consumption and raw speed based on active workloads. The dedicated Zephira RTX 4700G GPU features 8GB of GDDR6 VRAM and is optimized for CUDA and Tensor Core operations, enabling applications in real-time ray tracing, AI inference, and 3D rendering.\n",
    "\n",
    "The Vanta 16X includes a 2TB PCIe Gen 4 NVMe SSD, delivering sequential read/write speeds above 7GB/s, and 32GB of high-bandwidth DDR5 RAM. The machine supports hardware-accelerated virtualization and dual-booting, and ships with VireoOS Pro pre-installed, with official drivers available for Fedora, Ubuntu LTS, and NebulaOS.\n",
    "\n",
    "Input options are expansive. The keyboard features per-key RGB lighting and programmable macros, while the haptic touchpad supports multi-gesture navigation and palm rejection. Port variety includes dual Thunderbolt 4 ports, a full-size SD Express card reader, HDMI 2.1, 2.5G Ethernet, three USB-A 3.2 ports, and a 3.5mm TRRS audio jack. A fingerprint reader is embedded in the power button and supports biometric logins via Windows Hello.\n",
    "\n",
    "The history of the Cryon laptop line dates back to the early 2010s, when the company launched its first ultrabook aimed at mobile developers. Since then, successive generations have introduced carbon fiber lids, modular SSD bays, and convertible form factors. The Vanta 16X continues this tradition by integrating a customizable BIOS, a modular fan assembly, and a trackpad optimized for creative software like Blender and Adobe Creative Suite.\n",
    "\n",
    "Designed for software engineers, data scientists, film editors, and 3D artists, the Cryon Vanta 16X is a workstation-class laptop in a portable shell.\n",
    "\n",
    "## Tablets\n",
    "\n",
    "The Nebulyn Ark S12 Ultra reflects the current apex of tablet technology, combining high-end hardware with software environments tailored for productivity and creativity.\n",
    "\n",
    "The Ark S12 Ultra is built around a 12.9-inch OLED display that supports 144Hz refresh rate and HDR10+ dynamic range. With a resolution of 2800 x 1752 pixels and a contrast ratio of 1,000,000:1, the screen delivers vibrant color reproduction ideal for design and media consumption. The display supports true tone adaptation and low blue-light filtering for prolonged use.\n",
    "\n",
    "Internally, the tablet uses Qualcomm's Snapdragon 8 Gen 3 SoC, which includes an Adreno 750 GPU and an NPU for on-device AI tasks. The device ships with 16GB LPDDR5X RAM and 512GB of storage with support for NVMe expansion via a proprietary magnetic dock. The 11200mAh battery enables up to 15 hours of typical use and recharges to 80 percent in 45 minutes via 45W USB-C PD.\n",
    "\n",
    "The Ark's history traces back to the original Nebulyn Tab, which launched in 2014 as an e-reader and video streaming device. Since then, the line has evolved through multiple iterations that introduced stylus support, high-refresh screens, and multi-window desktop modes. The current model supports NebulynVerse, a DeX-like environment that allows external display mirroring and full multitasking with overlapping windows and keyboard shortcuts.\n",
    "\n",
    "Input capabilities are central to the Ark S12 Ultra’s appeal. The Pluma Stylus 3 features magnetic charging, 4096 pressure levels, and tilt detection. It integrates haptic feedback to simulate traditional pen strokes and brush textures. The device also supports a SnapCover keyboard that includes a trackpad and programmable shortcut keys. With the stylus and keyboard, users can effectively transform the tablet into a mobile workstation or digital sketchbook.\n",
    "\n",
    "Camera hardware includes a 13MP main sensor and a 12MP ultra-wide front camera with center-stage tracking and biometric unlock. Microphone arrays with beamforming enable studio-quality call audio. Connectivity includes Wi-Fi 7, Bluetooth 5.3, and optional LTE/5G with eSIM.\n",
    "\n",
    "Software support is robust. The device runs NebulynOS 6.0, based on Android 14L, and supports app sandboxing, multi-user profiles, and remote device management. Integration with cloud services, including SketchNimbus and ThoughtSpace, allows for real-time collaboration and syncing of content across devices.\n",
    "\n",
    "This tablet is targeted at professionals who require a balance between media consumption, creativity, and light productivity. Typical users include architects, consultants, university students, and UX designers.\n",
    "\n",
    "## Comparative Summary\n",
    "\n",
    "Each of these devices—the Veltrix Solis Z9, Cryon Vanta 16X, and Nebulyn Ark S12 Ultra—represents a best-in-class interpretation of its category. The Solis Z9 excels in mobile photography and everyday communication. The Vanta 16X is tailored for high-performance applications such as video production and AI prototyping. The Ark S12 Ultra provides a canvas for creativity, note-taking, and hybrid productivity use cases.\n",
    "\n",
    "## Historical Trends and Design Evolution\n",
    "\n",
    "Design across all three categories is converging toward modularity, longevity, and environmental sustainability. Recycled materials, reparability scores, and software longevity are becoming integral to brand reputation and product longevity. Future iterations are expected to feature tighter integration with wearable devices, ambient AI experiences, and cross-device workflows."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0IUUGtXSFB5G"
   },
   "source": [
    "<a id=\"installing-nat\"></a>\n",
    "## 0.4) Installing NeMo Agent Toolkit"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "OSICVNHGGm9l"
   },
   "source": [
    "The recommended way to install NAT is through `pip` or `uv pip`.\n",
    "\n",
    "First, we will install `uv` which offers parallel downloads and faster dependency resolution."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install uv"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "EBV2Gh9NIC8R"
   },
   "source": [
    "NeMo Agent toolkit can be installed through the PyPI `nvidia-nat` package.\n",
    "\n",
    "There are several optional subpackages available for NAT. For this example, we will rely on three subpackages:\n",
    "* The `langchain` subpackage contains useful components for integrating and running within [LangChain](https://python.langchain.com/docs/introduction/).\n",
    "* The `llama-index` subpackage contains useful components for integrating and running within [LlamaIndex](https://developers.llamaindex.ai/python/framework/).\n",
    "* The `phoenix` subpackage contains components for integrating with [Phoenix](https://phoenix.arize.com/).\n",
    "* The `profiling` subpackage contains components common for profiling with NeMo Agent toolkit."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash\n",
    "uv pip show -q \"nvidia-nat-langchain\"\n",
    "nat_langchain_installed=$?\n",
    "uv pip show -q \"nvidia-nat-llama-index\"\n",
    "nat_llama_index_installed=$?\n",
    "uv pip show -q \"nvidia-nat-phoenix\"\n",
    "nat_phoenix_installed=$?\n",
    "uv pip show -q \"nvidia-nat-profiling\"\n",
    "nat_profiling_installed=$?\n",
    "if [[ ${nat_langchain_installed} -ne 0 || ${nat_llama_index_installed} -ne 0 || ${nat_phoenix_installed} -ne 0 || ${nat_profiling_installed} -ne 0 ]]; then\n",
    "    uv pip install \"nvidia-nat[langchain,llama-index,phoenix,profiling]\"\n",
    "else\n",
    "    echo \"nvidia-nat[langchain,llama-index,phoenix,profiling] is already installed\"\n",
    "fi"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qrl3St-WWBQ2"
   },
   "source": [
    "<a id=\"creating-workflow\"></a>\n",
    "# 1.0) Creating a Tool-Calling Workflow\n",
    "\n",
    "In the previous notebook we went through a complex multi-agent example with several new tools. If you already have the example installed, you can skip this section."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nat workflow create retail_sales_agent_nb6"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iSDMOrSQKtBr"
   },
   "source": [
    "The following cells adding additional tools to the workflow and register them.\n",
    "\n",
    "* Sales Per Day Tool\n",
    "* Detect Outliers Tool\n",
    "* Total Product Sales Data Tool\n",
    "* LlamaIndex RAG Tool\n",
    "* Data Visualization Tools\n",
    "* Tool Registration"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"product-sales-tool\"></a>\n",
    "## 1.1) Total Product Sales Tool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/src/retail_sales_agent_nb6/total_product_sales_data_tool.py\n",
    "from pydantic import Field\n",
    "\n",
    "from nat.builder.builder import Builder\n",
    "from nat.builder.framework_enum import LLMFrameworkEnum\n",
    "from nat.builder.function_info import FunctionInfo\n",
    "from nat.cli.register_workflow import register_function\n",
    "from nat.data_models.function import FunctionBaseConfig\n",
    "\n",
    "\n",
    "class GetTotalProductSalesDataConfig(FunctionBaseConfig, name=\"get_total_product_sales_data\"):\n",
    "    \"\"\"Get total sales data by product.\"\"\"\n",
    "    data_path: str = Field(description=\"Path to the data file\")\n",
    "\n",
    "\n",
    "@register_function(config_type=GetTotalProductSalesDataConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n",
    "async def get_total_product_sales_data_function(config: GetTotalProductSalesDataConfig, _builder: Builder):\n",
    "    \"\"\"Get total sales data for a specific product.\"\"\"\n",
    "    import pandas as pd\n",
    "\n",
    "    df = pd.read_csv(config.data_path)\n",
    "\n",
    "    async def _get_total_product_sales_data(product_name: str) -> str:\n",
    "        \"\"\"\n",
    "        Retrieve total sales data for a specific product.\n",
    "\n",
    "        Args:\n",
    "            product_name: Name of the product\n",
    "\n",
    "        Returns:\n",
    "            String message containing total sales data\n",
    "        \"\"\"\n",
    "        df['Product'] = df[\"Product\"].apply(lambda x: x.lower())\n",
    "        revenue = df[df['Product'] == product_name]['Revenue'].sum()\n",
    "        units_sold = df[df['Product'] == product_name]['UnitsSold'].sum()\n",
    "\n",
    "        return f\"Revenue for {product_name} are {revenue} and total units sold are {units_sold}\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(\n",
    "        _get_total_product_sales_data,\n",
    "        description=_get_total_product_sales_data.__doc__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"sales-per-day-tool\"></a>\n",
    "## 1.2) Sales Per Day Tool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/src/retail_sales_agent_nb6/sales_per_day_tool.py\n",
    "from pydantic import Field\n",
    "\n",
    "from nat.builder.builder import Builder\n",
    "from nat.builder.framework_enum import LLMFrameworkEnum\n",
    "from nat.builder.function_info import FunctionInfo\n",
    "from nat.cli.register_workflow import register_function\n",
    "from nat.data_models.function import FunctionBaseConfig\n",
    "\n",
    "\n",
    "class GetSalesPerDayConfig(FunctionBaseConfig, name=\"get_sales_per_day\"):\n",
    "    \"\"\"Get total sales across all products per day.\"\"\"\n",
    "    data_path: str = Field(description=\"Path to the data file\")\n",
    "\n",
    "\n",
    "@register_function(config_type=GetSalesPerDayConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n",
    "async def sales_per_day_function(config: GetSalesPerDayConfig, builder: Builder):\n",
    "    \"\"\"Get total sales across all products per day.\"\"\"\n",
    "    import pandas as pd\n",
    "\n",
    "    df = pd.read_csv(config.data_path)\n",
    "    df['Product'] = df[\"Product\"].apply(lambda x: x.lower())\n",
    "\n",
    "    async def _get_sales_per_day(date: str, product: str) -> str:\n",
    "        \"\"\"\n",
    "        Calculate total sales data across all products for a specific date.\n",
    "\n",
    "        Args:\n",
    "            date: Date in YYYY-MM-DD format\n",
    "            product: Product name\n",
    "\n",
    "        Returns:\n",
    "            String message with the total sales for the day\n",
    "        \"\"\"\n",
    "        if date == \"None\":\n",
    "            return \"Please provide a date in YYYY-MM-DD format.\"\n",
    "        total_revenue = df[(df['Date'] == date) & (df['Product'] == product)]['Revenue'].sum()\n",
    "        total_units_sold = df[(df['Date'] == date) & (df['Product'] == product)]['UnitsSold'].sum()\n",
    "\n",
    "        return f\"Total revenue for {date} is {total_revenue} and total units sold is {total_units_sold}\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(\n",
    "        _get_sales_per_day,\n",
    "        description=_get_sales_per_day.__doc__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"detect-outliers-tool\"></a>\n",
    "## 1.3) Detect Outliers Tool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/src/retail_sales_agent_nb6/detect_outliers_tool.py\n",
    "from pydantic import Field\n",
    "\n",
    "from nat.builder.builder import Builder\n",
    "from nat.builder.framework_enum import LLMFrameworkEnum\n",
    "from nat.builder.function_info import FunctionInfo\n",
    "from nat.cli.register_workflow import register_function\n",
    "from nat.data_models.function import FunctionBaseConfig\n",
    "\n",
    "\n",
    "class DetectOutliersIQRConfig(FunctionBaseConfig, name=\"detect_outliers_iqr\"):\n",
    "    \"\"\"Detect outliers in sales data using IQR method.\"\"\"\n",
    "    data_path: str = Field(description=\"Path to the data file\")\n",
    "\n",
    "\n",
    "@register_function(config_type=DetectOutliersIQRConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n",
    "async def detect_outliers_iqr_function(config: DetectOutliersIQRConfig, _builder: Builder):\n",
    "    \"\"\"Detect outliers in sales data using the Interquartile Range (IQR) method.\"\"\"\n",
    "    import pandas as pd\n",
    "\n",
    "    df = pd.read_csv(config.data_path)\n",
    "\n",
    "    async def _detect_outliers_iqr(metric: str) -> str:\n",
    "        \"\"\"\n",
    "        Detect outliers in retail data using the IQR method.\n",
    "\n",
    "        Args:\n",
    "            metric: Specific metric to check for outliers\n",
    "\n",
    "        Returns:\n",
    "            Dictionary containing outlier analysis results\n",
    "        \"\"\"\n",
    "        if metric == \"None\":\n",
    "            column = \"Revenue\"\n",
    "        else:\n",
    "            column = metric\n",
    "\n",
    "        q1 = df[column].quantile(0.25)\n",
    "        q3 = df[column].quantile(0.75)\n",
    "        iqr = q3 - q1\n",
    "        outliers = df[(df[column] < q1 - 1.5 * iqr) | (df[column] > q3 + 1.5 * iqr)]\n",
    "\n",
    "        return f\"Outliers in {column} are {outliers.to_dict('records')}\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(\n",
    "        _detect_outliers_iqr,\n",
    "        description=_detect_outliers_iqr.__doc__)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"technical-specs-tool\"></a>\n",
    "## 1.4) Technical Specs Retrieval Tool"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/src/retail_sales_agent_nb6/retail_sales_rag_tool.py\n",
    "import logging\n",
    "import os\n",
    "\n",
    "from pydantic import Field\n",
    "\n",
    "from nat.builder.builder import Builder\n",
    "from nat.builder.framework_enum import LLMFrameworkEnum\n",
    "from nat.builder.function_info import FunctionInfo\n",
    "from nat.cli.register_workflow import register_function\n",
    "from nat.data_models.component_ref import EmbedderRef\n",
    "from nat.data_models.component_ref import LLMRef\n",
    "from nat.data_models.function import FunctionBaseConfig\n",
    "\n",
    "logger = logging.getLogger(__name__)\n",
    "\n",
    "\n",
    "class LlamaIndexRAGConfig(FunctionBaseConfig, name=\"retail_sales_rag\"):\n",
    "\n",
    "    llm_name: LLMRef = Field(description=\"The name of the LLM to use for the RAG engine.\")\n",
    "    embedder_name: EmbedderRef = Field(description=\"The name of the embedder to use for the RAG engine.\")\n",
    "    data_dir: str = Field(description=\"The directory containing the data to use for the RAG engine.\")\n",
    "    description: str = Field(description=\"A description of the knowledge included in the RAG system.\")\n",
    "    collection_name: str = Field(default=\"context\", description=\"The name of the collection to use for the RAG engine.\")\n",
    "\n",
    "\n",
    "def _walk_directory(root: str):\n",
    "    for root, dirs, files in os.walk(root):\n",
    "        for file_name in files:\n",
    "            yield os.path.join(root, file_name)\n",
    "\n",
    "\n",
    "@register_function(config_type=LlamaIndexRAGConfig, framework_wrappers=[LLMFrameworkEnum.LLAMA_INDEX])\n",
    "async def retail_sales_rag_tool(config: LlamaIndexRAGConfig, builder: Builder):\n",
    "    from llama_index.core import Settings\n",
    "    from llama_index.core import SimpleDirectoryReader\n",
    "    from llama_index.core import StorageContext\n",
    "    from llama_index.core import VectorStoreIndex\n",
    "    from llama_index.core.node_parser import SentenceSplitter\n",
    "\n",
    "    llm = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)\n",
    "    embedder = await builder.get_embedder(config.embedder_name, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)\n",
    "\n",
    "    Settings.embed_model = embedder\n",
    "    Settings.llm = llm\n",
    "\n",
    "    files = list(_walk_directory(config.data_dir))\n",
    "    docs = SimpleDirectoryReader(input_files=files).load_data()\n",
    "    logger.info(\"Loaded %s documents from %s\", len(docs), config.data_dir)\n",
    "\n",
    "    parser = SentenceSplitter(\n",
    "        chunk_size=400,\n",
    "        chunk_overlap=20,\n",
    "        separator=\" \",\n",
    "    )\n",
    "    nodes = parser.get_nodes_from_documents(docs)\n",
    "\n",
    "    index = VectorStoreIndex(nodes)\n",
    "\n",
    "    query_engine = index.as_query_engine(similarity_top_k=3, )\n",
    "\n",
    "    async def _arun(inputs: str) -> str:\n",
    "        \"\"\"\n",
    "        Search product catalog for information about tablets, laptops, and smartphones\n",
    "        Args:\n",
    "            inputs: user query about product specifications\n",
    "        \"\"\"\n",
    "        try:\n",
    "            response = query_engine.query(inputs)\n",
    "            return str(response.response)\n",
    "\n",
    "        except Exception as e:\n",
    "            logger.error(\"RAG query failed: %s\", e)\n",
    "            return f\"Sorry, I couldn't retrieve information about that product. Error: {str(e)}\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(_arun, description=config.description)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"plotting-tools\"></a>\n",
    "## 1.5) Data Analysis/Plotting Tools\n",
    "\n",
    "This is a new set of tools that will be registered to the data analysis and plotting agent. This set of tools allows the registered agent to plot the results of upstream data analysis tasks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form"
   },
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/src/retail_sales_agent_nb6/data_visualization_tools.py\n",
    "from pydantic import Field\n",
    "\n",
    "from nat.builder.builder import Builder\n",
    "from nat.builder.framework_enum import LLMFrameworkEnum\n",
    "from nat.builder.function_info import FunctionInfo\n",
    "from nat.cli.register_workflow import register_function\n",
    "from nat.data_models.component_ref import LLMRef\n",
    "from nat.data_models.function import FunctionBaseConfig\n",
    "\n",
    "\n",
    "class PlotSalesTrendForStoresConfig(FunctionBaseConfig, name=\"plot_sales_trend_for_stores\"):\n",
    "    \"\"\"Plot sales trend for a specific store.\"\"\"\n",
    "    data_path: str = Field(description=\"Path to the data file\")\n",
    "\n",
    "\n",
    "@register_function(config_type=PlotSalesTrendForStoresConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n",
    "async def plot_sales_trend_for_stores_function(config: PlotSalesTrendForStoresConfig, _builder: Builder):\n",
    "    \"\"\"Create a visualization of sales trends over time.\"\"\"\n",
    "    import matplotlib.pyplot as plt\n",
    "    import pandas as pd\n",
    "\n",
    "    df = pd.read_csv(config.data_path)\n",
    "\n",
    "    async def _plot_sales_trend_for_stores(store_id: str) -> str:\n",
    "        if store_id not in df[\"StoreID\"].unique():\n",
    "            data = df\n",
    "            title = \"Sales Trend for All Stores\"\n",
    "        else:\n",
    "            data = df[df[\"StoreID\"] == store_id]\n",
    "            title = f\"Sales Trend for Store {store_id}\"\n",
    "\n",
    "        plt.figure(figsize=(10, 5))\n",
    "        trend = data.groupby(\"Date\")[\"Revenue\"].sum()\n",
    "        trend.plot(title=title)\n",
    "        plt.xlabel(\"Date\")\n",
    "        plt.ylabel(\"Revenue\")\n",
    "        plt.tight_layout()\n",
    "        plt.savefig(\"sales_trend.png\")\n",
    "\n",
    "        return \"Sales trend plot saved to sales_trend.png\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(\n",
    "        _plot_sales_trend_for_stores,\n",
    "        description=(\n",
    "            \"This tool can be used to plot the sales trend for a specific store or all stores. \"\n",
    "            \"It takes in a store ID creates and saves an image of a plot of the revenue trend for that store.\"))\n",
    "\n",
    "\n",
    "class PlotAndCompareRevenueAcrossStoresConfig(FunctionBaseConfig, name=\"plot_and_compare_revenue_across_stores\"):\n",
    "    \"\"\"Plot and compare revenue across stores.\"\"\"\n",
    "    data_path: str = Field(description=\"Path to the data file\")\n",
    "\n",
    "\n",
    "@register_function(config_type=PlotAndCompareRevenueAcrossStoresConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n",
    "async def plot_revenue_across_stores_function(config: PlotAndCompareRevenueAcrossStoresConfig, _builder: Builder):\n",
    "    \"\"\"Create a visualization comparing sales trends between stores.\"\"\"\n",
    "    import matplotlib.pyplot as plt\n",
    "    import pandas as pd\n",
    "\n",
    "    df = pd.read_csv(config.data_path)\n",
    "\n",
    "    async def _plot_revenue_across_stores(arg: str) -> str:\n",
    "        pivot = df.pivot_table(index=\"Date\", columns=\"StoreID\", values=\"Revenue\", aggfunc=\"sum\")\n",
    "        pivot.plot(figsize=(12, 6), title=\"Revenue Trends Across Stores\")\n",
    "        plt.xlabel(\"Date\")\n",
    "        plt.ylabel(\"Revenue\")\n",
    "        plt.legend(title=\"StoreID\")\n",
    "        plt.tight_layout()\n",
    "        plt.savefig(\"revenue_across_stores.png\")\n",
    "\n",
    "        return \"Revenue trends across stores plot saved to revenue_across_stores.png\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(\n",
    "        _plot_revenue_across_stores,\n",
    "        description=(\n",
    "            \"This tool can be used to plot and compare the revenue trends across stores. Use this tool only if the \"\n",
    "            \"user asks for a comparison of revenue trends across stores.\"\n",
    "            \"It takes in a single string as input (which is ignored) and creates and saves an image of a plot of the revenue trends across stores.\"\n",
    "        ))\n",
    "\n",
    "\n",
    "class PlotAverageDailyRevenueConfig(FunctionBaseConfig, name=\"plot_average_daily_revenue\"):\n",
    "    \"\"\"Plot average daily revenue for stores and products.\"\"\"\n",
    "    data_path: str = Field(description=\"Path to the data file\")\n",
    "\n",
    "\n",
    "@register_function(config_type=PlotAverageDailyRevenueConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])\n",
    "async def plot_average_daily_revenue_function(config: PlotAverageDailyRevenueConfig, _builder: Builder):\n",
    "    \"\"\"Create a bar chart showing average daily revenue by day of week.\"\"\"\n",
    "    import matplotlib.pyplot as plt\n",
    "    import pandas as pd\n",
    "\n",
    "    df = pd.read_csv(config.data_path)\n",
    "\n",
    "    async def _plot_average_daily_revenue(arg: str) -> str:\n",
    "        daily_revenue = df.groupby([\"StoreID\", \"Product\", \"Date\"])[\"Revenue\"].sum().reset_index()\n",
    "\n",
    "        avg_daily_revenue = daily_revenue.groupby([\"StoreID\", \"Product\"])[\"Revenue\"].mean().unstack()\n",
    "\n",
    "        avg_daily_revenue.plot(kind=\"bar\", figsize=(12, 6), title=\"Average Daily Revenue per Store by Product\")\n",
    "        plt.ylabel(\"Average Revenue\")\n",
    "        plt.xlabel(\"Store ID\")\n",
    "        plt.xticks(rotation=0)\n",
    "        plt.legend(title=\"Product\", bbox_to_anchor=(1.05, 1), loc='upper left')\n",
    "        plt.tight_layout()\n",
    "        plt.savefig(\"average_daily_revenue.png\")\n",
    "\n",
    "        return \"Average daily revenue plot saved to average_daily_revenue.png\"\n",
    "\n",
    "    yield FunctionInfo.from_fn(\n",
    "        _plot_average_daily_revenue,\n",
    "        description=(\"This tool can be used to plot the average daily revenue for stores and products \"\n",
    "                     \"It takes in a single string as input and creates and saves an image of a grouped bar chart \"\n",
    "                     \"of the average daily revenue\"))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"register-tools\"></a>\n",
    "## 1.6) Register The Tools"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile -a retail_sales_agent_nb6/src/retail_sales_agent_nb6/register.py\n",
    "\n",
    "from . import sales_per_day_tool\n",
    "from . import detect_outliers_tool\n",
    "from . import total_product_sales_data_tool\n",
    "from . import retail_sales_rag_tool\n",
    "from . import data_visualization_tools"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "KAGE-pJ_OZ_P"
   },
   "source": [
    "<a id=\"workflow-config\"></a>\n",
    "## 1.7) Workflow Configuration File\n",
    "\n",
    "The following cell creates a basic workflow configuration file"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/configs/config.yml\n",
    "llms:\n",
    "  nim_llm:\n",
    "    _type: nim\n",
    "    model_name: meta/llama-3.3-70b-instruct\n",
    "    temperature: 0.0\n",
    "    max_tokens: 2048\n",
    "    context_window: 32768\n",
    "    api_key: $NVIDIA_API_KEY\n",
    "\n",
    "embedders:\n",
    "  nim_embedder:\n",
    "    _type: nim\n",
    "    model_name: nvidia/nv-embedqa-e5-v5\n",
    "    truncate: END\n",
    "    api_key: $NVIDIA_API_KEY\n",
    "\n",
    "functions:\n",
    "  total_product_sales_data:\n",
    "    _type: get_total_product_sales_data\n",
    "    data_path: data/retail_sales_data.csv\n",
    "  sales_per_day:\n",
    "    _type: get_sales_per_day\n",
    "    data_path: data/retail_sales_data.csv\n",
    "  detect_outliers:\n",
    "    _type: detect_outliers_iqr\n",
    "    data_path: data/retail_sales_data.csv\n",
    "\n",
    "  data_analysis_agent:\n",
    "    _type: tool_calling_agent\n",
    "    tool_names:\n",
    "      - total_product_sales_data\n",
    "      - sales_per_day\n",
    "      - detect_outliers\n",
    "    llm_name: nim_llm\n",
    "    max_history: 10\n",
    "    max_iterations: 15\n",
    "    description: |\n",
    "      A helpful assistant that can answer questions about the retail sales CSV data.\n",
    "      Use the tools to answer the questions.\n",
    "      Input is a single string.\n",
    "    verbose: false\n",
    "\n",
    "  product_catalog_rag:\n",
    "    _type: retail_sales_rag\n",
    "    llm_name: nim_llm\n",
    "    embedder_name: nim_embedder\n",
    "    collection_name: product_catalog_rag\n",
    "    data_dir: data/rag/\n",
    "    description: \"Search product catalog for TabZen tablet, AeroBook laptop, NovaPhone specifications\"\n",
    "\n",
    "  rag_agent:\n",
    "    _type: react_agent\n",
    "    llm_name: nim_llm\n",
    "    tool_names: [product_catalog_rag]\n",
    "    max_history: 3\n",
    "    max_iterations: 5\n",
    "    max_retries: 2\n",
    "    description: |\n",
    "      An assistant that can only answer questions about products.\n",
    "      Use the product_catalog_rag tool to answer questions about products.\n",
    "      Do not make up any information.\n",
    "    verbose: false\n",
    "\n",
    "  plot_sales_trend_for_stores:\n",
    "    _type: plot_sales_trend_for_stores\n",
    "    data_path: data/retail_sales_data.csv\n",
    "  plot_and_compare_revenue_across_stores:\n",
    "    _type: plot_and_compare_revenue_across_stores\n",
    "    data_path: data/retail_sales_data.csv\n",
    "  plot_average_daily_revenue:\n",
    "    _type: plot_average_daily_revenue\n",
    "    data_path: data/retail_sales_data.csv\n",
    "\n",
    "  data_visualization_agent:\n",
    "    _type: react_agent\n",
    "    llm_name: nim_llm\n",
    "    tool_names:\n",
    "      - plot_sales_trend_for_stores\n",
    "      - plot_and_compare_revenue_across_stores\n",
    "      - plot_average_daily_revenue\n",
    "    max_history: 10\n",
    "    max_iterations: 15\n",
    "    description: |\n",
    "      You are a data visualization expert.\n",
    "      You can only create plots and visualizations based on user requests.\n",
    "      Only use available tools to generate plots.\n",
    "      You cannot analyze any data.\n",
    "    verbose: false\n",
    "    handle_parsing_errors: true\n",
    "    max_retries: 2\n",
    "    retry_parsing_errors: true\n",
    "\n",
    "workflow:\n",
    "  _type: react_agent\n",
    "  tool_names: [data_analysis_agent, data_visualization_agent, rag_agent]\n",
    "  llm_name: nim_llm\n",
    "  verbose: true\n",
    "  handle_parsing_errors: true\n",
    "  max_retries: 2\n",
    "  system_prompt: |\n",
    "    Answer the following questions as best you can.\n",
    "    You may communicate and collaborate with various experts to answer the questions.\n",
    "\n",
    "    {tools}\n",
    "\n",
    "    You may respond in one of two formats.\n",
    "    Use the following format exactly to communicate with an expert:\n",
    "\n",
    "    Question: the input question you must answer\n",
    "    Thought: you should always think about what to do\n",
    "    Action: the action to take, should be one of [{tool_names}]\n",
    "    Action Input: the input to the action (if there is no required input, include \"Action Input: None\")\n",
    "    Observation: wait for the expert to respond, do not assume the expert's response\n",
    "\n",
    "    ... (this Thought/Action/Action Input/Observation can repeat N times.)\n",
    "    Use the following format once you have the final answer:\n",
    "\n",
    "    Thought: I now know the final answer\n",
    "    Final Answer: the final answer to the original input question"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "9ugVMpgoSlb_"
   },
   "source": [
    "<a id=\"verify-tools\"></a>\n",
    "## 1.8) Testing/Verifying Workflow Installation\n",
    "\n",
    "You can verify the workflow was successfully set up by running the following example:\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nat run --config_file retail_sales_agent_nb6/configs/config.yml \\\n",
    "  --input \"What is the Ark S12 Ultra tablet and what are its specifications?\" \\\n",
    "  --input \"How do laptop sales compare to phone sales?\" \\\n",
    "  --input \"Plot average daily revenue\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ItzxNviJof2Q"
   },
   "source": [
    "<a id=\"observe-workflow\"></a>\n",
    "# 2.0) Observing a Workflow with Phoenix\n",
    "\n",
    "> **Note:** _This portion of the example will only work when the notebook is run locally. It may not work through Google Colab and other online notebook environments._"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "b-7r6YUhOWAs"
   },
   "source": [
    "Phoenix is an open-source observability platform designed for monitoring, debugging, and improving LLM applications and AI agents. It provides a web-based interface for visualizing and analyzing traces from LLM applications, agent workflows, and ML pipelines. Phoenix automatically captures key metrics such as latency, token usage, and costs, and displays the inputs and outputs at each step, making it invaluable for debugging complex agent behaviors and identifying performance bottlenecks in AI workflows."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "drjEt3WkyK8l"
   },
   "source": [
    "<a id=\"update-config\"></a>\n",
    "## 2.1) Updating the Workflow Configuration For Telemetry\n",
    "\n",
    "We will need to update the workflow configuration file to support telemetry tracing with Phoenix."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hF8z4R1Vyr4_"
   },
   "source": [
    "To do this, we will first copy the original configuration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cp retail_sales_agent_nb6/configs/config.yml retail_sales_agent_nb6/configs/phoenix_config.yml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "cBuWIqYHyzhJ"
   },
   "source": [
    "Then we will append necessary configuration components to the `phoenix_config.yml` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile -a retail_sales_agent_nb6/configs/phoenix_config.yml\n",
    "\n",
    "general:\n",
    "  telemetry:\n",
    "    logging:\n",
    "      console:\n",
    "        _type: console\n",
    "        level: WARN\n",
    "    tracing:\n",
    "      phoenix:\n",
    "        _type: phoenix\n",
    "        endpoint: http://localhost:6006/v1/traces\n",
    "        project: retail_sales_agent_nb6\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kzGYACji_eh3"
   },
   "source": [
    "<a id=\"start-phoenix\"></a>\n",
    "## 2.2) Start Phoenix Server"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we will install Phoenix:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!uv pip install arize-phoenix"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Uk0fRgMY6RX9"
   },
   "source": [
    "Then, we will ensure the service is publicly accessible:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%env PHOENIX_HOST=0.0.0.0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "e2ajQ08B9jGG"
   },
   "source": [
    "Finally, we will start the server:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%bash --bg\n",
    "# phoenix will run on port 6006\n",
    "phoenix serve"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pCScuDXVziTi"
   },
   "source": [
    "<a id=\"rerun-workflow\"></a>\n",
    "## 2.3) Rerun the Workflow\n",
    "\n",
    "Instead of the original workflow configuration, we will run with the updated `phoenix_config.yml` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nat run --config_file retail_sales_agent_nb6/configs/phoenix_config.yml \\\n",
    "  --input \"What is the Ark S12 Ultra tablet and what are its specifications?\" \\\n",
    "  --input \"How do laptop sales compare to phone sales?\" \\\n",
    "  --input \"Plot average daily revenue\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Ka6DC7YC-JbJ"
   },
   "source": [
    "<a id=\"view-trace\"></a>\n",
    "## 2.3) Viewing the trace\n",
    "\n",
    "You can access the Phoenix server at http://localhost:6006"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "j8q7dYytOqX4"
   },
   "source": [
    "<a id=\"eval-workflow\"></a>\n",
    "# 3.0) Evaluating a Workflow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "hci41nsrhgo6"
   },
   "source": [
    "After setting up observability, the next step is to evaluate your workflow's performance against a test dataset. NAT provides a powerful evaluation framework that can assess your agent's responses using various metrics and evaluators.\n",
    "\n",
    "For detailed information on evaluation, please refer to the [Evaluating NVIDIA NeMo Agent Toolkit Workflows](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/evaluate.html).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "vO9wbpgNhgo6"
   },
   "source": [
    "<a id=\"eval-dataset\"></a>\n",
    "## 3.1) Create an Evaluation Dataset\n",
    "\n",
    "For evaluating this workflow, we will created a sample dataset.\n",
    "\n",
    "The dataset will contain three test cases covering different query types. Each entry contains a question and the expected answer that the agent should provide.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile retail_sales_agent_nb6/data/eval_data.json\n",
    "[\n",
    "    {\n",
    "        \"id\": \"1\",\n",
    "        \"question\": \"How do laptop sales compare to phone sales?\",\n",
    "        \"answer\": \"Phone sales are higher than laptop sales in terms of both revenue and units sold. Phones generated a revenue of 561,000 with 1,122 units sold, whereas laptops generated a revenue of 512,000 with 512 units sold.\"\n",
    "    },\n",
    "    {\n",
    "        \"id\": \"2\",\n",
    "        \"question\": \"What is the Ark S12 Ultra tablet and what are its specifications?\",\n",
    "        \"answer\": \"The Ark S12 Ultra Ultra tablet features a 12.9-inch OLED display with a 144Hz refresh rate, HDR10+ dynamic range, and a resolution of 2800 x 1752 pixels. It has a contrast ratio of 1,000,000:1. The device is powered by Qualcomm's Snapdragon 8 Gen 3 SoC, which includes an Adreno 750 GPU and an NPU for on-device AI tasks. It comes with 16GB LPDDR5X RAM and 512GB of storage, with support for NVMe expansion via a proprietary magnetic dock. The tablet has a 11200mAh battery that enables up to 15 hours of typical use and recharges to 80 percent in 45 minutes via 45W USB-C PD. Additionally, it features a 13MP main sensor and a 12MP ultra-wide front camera, microphone arrays with beamforming, Wi-Fi 7, Bluetooth 5.3, and optional LTE/5G with eSIM. The device runs NebulynOS 6.0, based on Android 14L, and supports app sandboxing, multi-user profiles, and remote device management. It also includes the Pluma Stylus 3 with magnetic charging, 4096 pressure levels, and tilt detection, as well as a SnapCover keyboard with a trackpad and programmable shortcut keys.\"\n",
    "    },\n",
    "    {\n",
    "        \"id\": \"3\",\n",
    "        \"question\": \"What were the laptop sales on Feb 16th 2024?\",\n",
    "        \"answer\": \"On February 16th, 2024, the total laptop sales were 13 units, generating a total revenue of $13,000.\"\n",
    "    }\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FWxbhiB9SK8K"
   },
   "source": [
    "<a id=\"update-config-again\"></a>\n",
    "## 3.2) Updating the Workflow Configuration\n",
    "\n",
    "Workflow configuration files can contain extra settings relevant for evaluation and profiling."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "v7QmbGpvUDkZ"
   },
   "source": [
    "To do this, we will first copy the original configuration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cp retail_sales_agent_nb6/configs/config.yml retail_sales_agent_nb6/configs/config_eval.yml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Gsrj4FUSUDka"
   },
   "source": [
    "*Then* we will append necessary configuration components to the `config_eval.yml` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile -a retail_sales_agent_nb6/configs/config_eval.yml\n",
    "\n",
    "eval:\n",
    "  general:\n",
    "    output_dir: ./eval_output\n",
    "    verbose: true\n",
    "    dataset:\n",
    "        _type: json\n",
    "        file_path: ./retail_sales_agent_nb6/data/eval_data.json\n",
    "\n",
    "  evaluators:\n",
    "    accuracy:\n",
    "      _type: ragas\n",
    "      metric: AnswerAccuracy\n",
    "      llm_name: nim_llm\n",
    "    groundedness:\n",
    "      _type: ragas\n",
    "      metric: ResponseGroundedness\n",
    "      llm_name: nim_llm\n",
    "    relevance:\n",
    "      _type: ragas\n",
    "      metric: ContextRelevance\n",
    "      llm_name: nim_llm\n",
    "    trajectory_accuracy:\n",
    "      _type: trajectory\n",
    "      llm_name: nim_llm\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kpr0vte_hgo6"
   },
   "source": [
    "<a id=\"run-eval\"></a>\n",
    "## 3.3) Running the Evaluation\n",
    "\n",
    "The `nat eval` command executes the workflow against all entries in the dataset and evaluates the results using configured evaluators. Run the cell below to evaluate the retail sales agent workflow.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nat eval --config_file retail_sales_agent_nb6/configs/config_eval.yml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "1hM9ObwXhgo7"
   },
   "source": [
    "<a id=\"understand-eval\"></a>\n",
    "## 3.4) Understanding Evaluation Results\n",
    "\n",
    "The `nat eval` command runs the workflow on all entries in the dataset and produces several output files:\n",
    "\n",
    "- **`workflow_output.json`**: Contains the raw outputs from the workflow for each input in the dataset\n",
    "- **Evaluator-specific files**: Each configured evaluator generates its own output file with scores and reasoning\n",
    "\n",
    "#### Evaluation Scores\n",
    "\n",
    "Each evaluator provides:\n",
    "- An **average score** across all dataset entries (0-1 scale, where 1 is perfect)\n",
    "- **Individual scores** for each entry with detailed reasoning\n",
    "- **Performance metrics** to help identify areas for improvement\n",
    "\n",
    "All evaluation results are stored in the `output_dir` specified in the configuration file.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ouCqR1daVg59"
   },
   "source": [
    "<a id=\"profile-workflow\"></a>\n",
    "## 4.0)  Profiling a Workflow"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "P06nWQI6hgo7"
   },
   "source": [
    "Profiling provides deep insights into your workflow's performance characteristics, helping you identify bottlenecks, optimize resource usage, and improve overall efficiency.\n",
    "\n",
    "For detailed information on profiling, please refer to the [Profiling and Performance Monitoring of NVIDIA NeMo Agent Toolkit Workflows](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/profiler.html).\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "kBUu8wVzYT93"
   },
   "source": [
    "<a id=\"update-profiling-workflow\"></a>\n",
    "## 4.1) Updating the Workflow Configuration\n",
    "\n",
    "Workflow configuration files can contain extra settings relevant for evaluation and profiling."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IREct15KYT94"
   },
   "source": [
    "To do this, we will first copy the original configuration:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!cp retail_sales_agent_nb6/configs/config.yml retail_sales_agent_nb6/configs/config_profile.yml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8iONd0KTYT94"
   },
   "source": [
    "*Then* we will append necessary configuration components to the `config_profile.yml` file:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%writefile -a retail_sales_agent_nb6/configs/config_profile.yml\n",
    "\n",
    "eval:\n",
    "  general:\n",
    "    output_dir: ./profile_output\n",
    "    verbose: true\n",
    "    dataset:\n",
    "        _type: json\n",
    "        file_path: ./retail_sales_agent_nb6/data/eval_data.json\n",
    "\n",
    "    profiler:\n",
    "        token_uniqueness_forecast: true\n",
    "        workflow_runtime_forecast: true\n",
    "        compute_llm_metrics: true\n",
    "        csv_exclude_io_text: true\n",
    "        prompt_caching_prefixes:\n",
    "          enable: true\n",
    "          min_frequency: 0.1\n",
    "        bottleneck_analysis:\n",
    "          enable_nested_stack: true\n",
    "        concurrency_spike_analysis:\n",
    "          enable: true\n",
    "          spike_threshold: 7\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nNLdnyc1hgo7"
   },
   "source": [
    "<a id=\"understand-profiler-config\"></a>\n",
    "## 4.2) Understanding the Profiler Configuration\n",
    "\n",
    "We will reuse the same configuration as evaluation.\n",
    "\n",
    "The profiler is configured through the `profiler` section of your workflow configuration file. It runs alongside the `nat eval` command and offers several analysis options:\n",
    "\n",
    "#### Key Configuration Options:\n",
    "\n",
    "- **`token_uniqueness_forecast`**: Computes the inter-query token uniqueness forecast, predicting the expected number of unique tokens in the next query based on tokens used in previous queries\n",
    "\n",
    "- **`workflow_runtime_forecast`**: Calculates the expected workflow runtime based on historical query performance\n",
    "\n",
    "- **`compute_llm_metrics`**: Computes inference optimization metrics including latency, throughput, and other performance indicators\n",
    "\n",
    "- **`csv_exclude_io_text`**: Prevents large text from being dumped into output CSV files, preserving CSV structure and readability\n",
    "\n",
    "- **`prompt_caching_prefixes`**: Identifies common prompt prefixes that can be pre-populated in KV caches for improved performance\n",
    "\n",
    "- **`bottleneck_analysis`**: Analyzes workflow performance measures such as bottlenecks, latency, and concurrency spikes\n",
    "  - `simple_stack`: Provides a high-level analysis\n",
    "  - `nested_stack`: Offers detailed analysis of nested bottlenecks (e.g., tool calls inside other tool calls)\n",
    "\n",
    "- **`concurrency_spike_analysis`**: Identifies concurrency spikes in your workflow. The `spike_threshold` parameter (e.g., 7) determines when to flag spikes based on the number of concurrent running functions\n",
    "\n",
    "#### Output Directory\n",
    "\n",
    "The `output_dir` parameter specifies where all profiler outputs will be stored for later analysis.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "A1wwbC_Lhgo7"
   },
   "source": [
    "<a id=\"run-profiler\"></a>\n",
    "## 4.3) Running the Profiler\n",
    "\n",
    "The profiler runs as part of the `nat eval` command. When properly configured, it will collect performance data across all evaluation runs and generate comprehensive profiling reports.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!nat eval --config_file retail_sales_agent_nb6/configs/config_profile.yml"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "FvwCiUrqaOaf"
   },
   "source": [
    "<a id=\"understand-profiler-output-files\"></a>\n",
    "## 4.4) Understanding Profiler Output Files\n",
    "\n",
    "Based on the profiler configuration, the following files will be generated in the `output_dir`:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "_YFrbGAWhgo7"
   },
   "source": [
    "**Core Output Files:**\n",
    "\n",
    "1. **`all_requests_profiler_traces.json`**: Raw usage statistics collected by the profiler, including:\n",
    "   - Raw traces of LLM interactions\n",
    "   - Tool input and output data\n",
    "   - Runtime measurements\n",
    "   - Execution metadata\n",
    "\n",
    "2. **`inference_optimization.json`**: Workflow-specific performance metrics with confidence intervals:\n",
    "   - 90%, 95%, and 99% confidence intervals for latency\n",
    "   - Throughput statistics\n",
    "   - Workflow runtime predictions\n",
    "\n",
    "3. **`standardized_data_all.csv`**: Standardized usage data in CSV format containing:\n",
    "   - Prompt tokens and completion tokens\n",
    "   - LLM input/output\n",
    "   - Framework information\n",
    "   - Additional metadata\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QMcdVjOZaQkD"
   },
   "source": [
    "**Advanced Analysis Files**\n",
    "\n",
    "4. **Analysis Reports**: JSON files and text reports for any advanced techniques enabled:\n",
    "   - Concurrency analysis results\n",
    "   - Bottleneck analysis reports\n",
    "   - PrefixSpan pattern mining results\n",
    "\n",
    "These files provide comprehensive insights into your workflow's performance and can be used for optimization and debugging."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Bf7ICQLiaTje"
   },
   "source": [
    "**Gantt Chart**\n",
    "\n",
    "We can also view a Gantt chart of the profile run:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from IPython.display import Image\n",
    "\n",
    "Image(\"profile_output/gantt_chart.png\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iIUhnLt-hgo7"
   },
   "source": [
    "<a id=\"summary\"></a>\n",
    "# 5.0) Notebook Summary\n",
    "\n",
    "In this notebook, we covered the complete workflow for observability, evaluation, and profiling in NeMo Agent Toolkit:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "sSo0k_JUatEe"
   },
   "source": [
    "**Observability with Phoenix**\n",
    "- Configured tracing in the workflow configuration\n",
    "- Started the Phoenix server for real-time monitoring\n",
    "- Executed workflows with automatic trace capture\n",
    "- Visualized agent execution flow and LLM interactions\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "QsVf_g5Qaxbe"
   },
   "source": [
    "**Evaluation with `nat eval`**\n",
    "- Created a comprehensive evaluation dataset\n",
    "- Ran automated evaluations across multiple test cases\n",
    "- Reviewed evaluation metrics and scores\n",
    "- Analyzed workflow performance against expected outputs\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "qCn1i8ghazKp"
   },
   "source": [
    "**Profiling for Performance Optimization**\n",
    "- Configured advanced profiling options\n",
    "- Collected performance metrics and usage statistics\n",
    "- Generated detailed profiling reports\n",
    "- Identified bottlenecks and optimization opportunities\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0dF8JoWda0bl"
   },
   "source": [
    "These three pillars—observability, evaluation, and profiling—work together to provide a complete picture of your agent's behavior, accuracy, and performance, enabling you to build production-ready AI applications with confidence."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<a id=\"next-steps\"></a>\n",
    "# 6.0) Next steps\n",
    "\n",
    "Continue learning with the next notebook in our series: `optimize_model_selection.ipynb` where we will demonstrate how `nat optimize` can be used to identify the best set of models, parameters, and prompts, for your use case."
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
